Power‐Law‐Based Synthetic Minority Oversampling Technique on Imbalanced Serum Surface‐Enhanced Raman Spectroscopy Data for Cancer Screening
Abstract
Surface-enhanced Raman spectroscopy (SERS) has shown great promise for cancer screening. However, previous "proof-of-concept" studies ignored the natural imbalance of cancer types in the population, leading the model to be biased toward learning more features of the majority class during training at the expense of the minority class. Herein, a power-law-based synthetic minority oversampling technique (PL-SMOTE) is proposed to guide the resampling of multiclass serum SERS data by analyzing the long-tailed (power-law) distribution of cancer prevalence in the population. PL-SMOTE balances the number of minority samples to resample against the overlap between classes by introducing a modulating factor. Modeling on the resampled datasets synthesized by PL-SMOTE verifies the effectiveness of the method. After further fine-tuning of the parameters of the deep neural network and of the PL-SMOTE method, an optimal cancer screening model is obtained, with a macroaveraged Recall of 97.24% and an F2-Score of 97.38%. A new method for imbalanced learning is thus provided, which significantly improves cancer screening performance and may also inspire work on other imbalanced scenarios, such as biological medicine, anomaly detection, and disaster prediction.
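The two metrics reported in the abstract can be illustrated with a short sketch. This is not the authors' evaluation code: the function name `macro_recall_and_fbeta` and the toy labels are assumptions made for illustration, and the sketch uses only the standard definitions of per-class recall and the F-beta score (with beta = 2, which weights recall more heavily than precision), averaged with equal weight per class.

```python
def macro_recall_and_fbeta(y_true, y_pred, beta=2.0):
    """Return (macro recall, macro F-beta) with every class weighted equally.

    Hypothetical helper for illustration; standard metric definitions only.
    """
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    recalls, fbetas = [], []
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
        recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
        denom = beta ** 2 * precision + recall
        fbeta = (1 + beta ** 2) * precision * recall / denom if denom > 0 else 0.0
        recalls.append(recall)
        fbetas.append(fbeta)
    # Macro averaging: unweighted mean over classes, so a rare class
    # counts as much as a common one.
    return sum(recalls) / len(classes), sum(fbetas) / len(classes)


# Toy imbalanced example (made-up labels): 8 majority samples, 2 minority.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]  # one of the two minority samples is missed

macro_recall, macro_f2 = macro_recall_and_fbeta(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.2f} macro_recall={macro_recall:.2f} macro_F2={macro_f2:.3f}")
```

In this toy case the accuracy is 0.90 while the macroaveraged Recall is only 0.75, which is why macroaveraged scores are a natural choice when the minority class matters as much as the majority class.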
Related resources
A Classification Model for Imbalanced Medical Data based on PCA and Farther Distance based Synthetic Minority Oversampling Technique
Medical data are extensively used in the diagnosis of human health. So it has played a vital role for physicians as well as in medical engineering. Accordingly, many types of research are going on related to this to have a better prediction of the diseases or to improve the diagnosis quality. However, most of the researchers work on either dimensionality space or imbalanced data. Due to this, s...
RBM-SMOTE: Restricted Boltzmann Machines for Synthetic Minority Oversampling Technique
The problem of imbalanced data, i.e., when the class labels are unequally distributed, is encountered in many real-life applications, e.g., credit scoring and medical diagnostics. Various approaches aimed at dealing with imbalanced data have been proposed. One of the best-known data pre-processing methods is the Synthetic Minority Oversampling Technique (SMOTE). However, SMOTE may generate ...
A Synthetic Minority Oversampling Method Based on Local Densities in Low-Dimensional Space for Imbalanced Learning
Imbalanced class distribution is a challenging problem in many real-life classification tasks. Existing synthetic oversampling methods suffer from the curse of dimensionality because they rely heavily on Euclidean distance. This paper proposes a new method, called Minority Oversampling Technique based on Local Densities in Low-Dimensional Space (MOT2LD for short). MOT2LD first maps each trainin...
WEMOTE - Word Embedding based Minority Oversampling Technique for Imbalanced Emotion and Sentiment Classification
Imbalanced training data is a persistent problem for supervised-learning-based emotion and sentiment classification. Several existing studies have shown that data sparseness and small disjuncts are the two major factors affecting classification. To address these two problems, this paper presents a word-embedding-based oversampling method. Firstly, a large-scale text corpus is used to train a continuous ...
Adaptive Oversampling for Imbalanced Data Classification
Data imbalance is known to significantly hinder the generalization performance of supervised learning algorithms. A common strategy to overcome this challenge is synthetic oversampling, where synthetic minority class examples are generated to balance the distribution between the examples of the majority and minority classes. We present a novel adaptive oversampling algorithm, VIRTUAL, that comb...
Journal
Journal title: Advanced Intelligent Systems
Year: 2023
ISSN: 2640-4567
DOI: https://doi.org/10.1002/aisy.202300006